Type subsumption cache: handle unsolved type vars #19040
base: main
Conversation
❗ Release notes required
Yet another bug. This needs a careful look. I also need to run some benchmarks again and compare standalone builds with Vlad's original implementation.
@majocha: It would equally help if two typars are both unsolved but strip down to the same form. Wasn't the mistake more the fact that the solved and unsolved forms of the same TType_var always had the same key (based on Stamp)? This is definitely a safe fix, but I wonder if it doesn't limit the potential of the cache for typars too much.
Yes, that makes sense. It makes me wonder if we could also handle unsolved nullness better.
Nullness is not stamped, so in the case of unresolved (not-yet-resolved) nullness, there is no identifier to hold onto.
This fix is accidental; the real problem is here:
fsharp/src/Compiler/Utilities/TypeHashing.fs Line 480 in 9670f60
This was intended to speed things up by weakly attaching the computed TypeStructures to their respective TTypes. We cannot do this for types that are not fully solved: they still mutate, and the attached structures get outdated.
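As a toy illustration of why that weak attachment breaks for not fully solved types (stand-in types only; FakeTypar, FakeType and structureOf are hypothetical, not the compiler's API): once a structure is attached via a ConditionalWeakTable, later mutation of the type variable's solution is never reflected in the cached value.

```fsharp
open System.Runtime.CompilerServices

// Stand-ins for a mutable type variable and a type that contains it.
type FakeTypar = { Stamp: int64; mutable Solution: string option }
type FakeType = { Args: FakeTypar list }

// Computes a structural key from the current state of the type.
let structureOf (ty: FakeType) =
    ty.Args
    |> List.map (fun tp -> defaultArg tp.Solution (sprintf "unsolved-%d" tp.Stamp))
    |> String.concat ";"

// Weakly attaches the computed structure to the type instance itself.
let weakCache = ConditionalWeakTable<FakeType, string>()

let cachedStructureOf (ty: FakeType) =
    match weakCache.TryGetValue ty with
    | true, cached -> cached
    | _ ->
        let s = structureOf ty
        weakCache.Add(ty, s)
        s

let tp = { Stamp = 1L; Solution = None }
let ty = { Args = [ tp ] }
printfn "%s" (cachedStructureOf ty)  // "unsolved-1" is computed and attached
tp.Solution <- Some "System.String"  // the type var is solved later on
printfn "%s" (cachedStructureOf ty)  // still "unsolved-1": the attached structure is stale
```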
Yes, in fact in the case of typars we don't want the stamp as identity at all. We should use the structure of the solution (TType) if available, or a common token for unsolved ones.
Co-authored-by: Brian Rourke Boll <[email protected]>
I'm testing this in the IDE with the notorious OpenTK 5.0. It improved things significantly: there are far fewer cache entries now but many more hits, resulting in a 99% hit ratio with little memory use and no constant churn during edits.
let rec private accumulateTypar (typar: Typar) =
    seq {
        match typar.Solution with
        | Some ty -> yield! accumulateTType ty
In general, the input going into the cache is already stripped, right?
i.e. we should not be getting long chains of solution pointers for something which is solved.
Yeah since this is operating solely on stripped types, if we encounter a type var, it should never be a solved one. In theory.
The huge mistake that caused a lot of inefficiency was emitting typar stamps as part of the cache key. This produced a lot of equivalent but effectively unusable keys polluting the cache. The increased churn is especially heavy on the IDE. I cannot see a solution to this that would guarantee soundness.
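A self-contained toy illustration of that pollution (stand-in FakeTypar only, not the compiler's types): structurally equivalent unsolved type vars carry distinct stamps, so stamp-based keys never repeat, every lookup misses, and the cache keeps growing.

```fsharp
open System.Collections.Generic

// Stand-in for a type variable whose only identity is its stamp.
type FakeTypar = { Stamp: int64 }

let mutable nextStamp = 0L
let freshTypar () =
    nextStamp <- nextStamp + 1L
    { Stamp = nextStamp }

// Stamp-based key, as in the situation described above.
let keyOf (tp: FakeTypar) = string tp.Stamp

let cache = Dictionary<string, bool>()

// Three structurally identical "unsolved" vars, e.g. from repeated checking in the IDE.
for _ in 1 .. 3 do
    let tp = freshTypar ()
    let key = keyOf tp
    printfn "hit = %b" (cache.ContainsKey key)  // always false: every key is new
    cache.[key] <- true

printfn "entries = %d" cache.Count  // 3 entries for one logical shape: pure churn
```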
What about skipping the cache for them? Of course it somewhat limits the genericity of the cache and is specially tailored to the place of application. The type subsumption cache was really needed for concrete types, with supertype and interface hierarchies. I was thinking of content-based hashing, but you would need to hash all type constraints content-wise, as they are mutable as well. Maybe the inherent mutation of a type var's solution and constraints is a good enough reason to treat them differently?
So, basically, we care only about caching TType_app.
I wonder if the constraints are really a problem with respect to the type subsumption cache, because the current situation, where we emit only a stamp as typar identity, has worked so far (the bug this PR tries to fix is elsewhere). A given typar can very well have its constraints mutated while the stamp stays constant, so it is already not collision-free in general; it just does not collide in practice for this usage. I think it has been like this since Vlad's original implementation.
Copilot, when asked: Q: which typar constraints can influence the result of TypeFeasiblySubsumesType? I wonder because I need to know what should take part in keying the TypeSubsumptionCache.
hmm.
let tokens =
    emitTType env ty
    |> Seq.filter (fun t -> t <> TypeToken.Nullness NullnessInfo.WithoutNull)
You might as well solve it at emission time (i.e. not emit a nullness token into the sequence at all), or?
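A rough sketch of that suggestion, reusing the names from the quoted diffs (emitNullness, TypeToken, env.stable); the handling of the ValueNone case is an assumption, not something taken from the PR:

```fsharp
// Decide at emission time: emit nothing for the default WithoutNull case,
// so no Seq.filter pass over the token stream is needed afterwards.
let private emitNullness env (n: Nullness) =
    seq {
        match n.TryEvaluate() with
        | ValueSome NullnessInfo.WithoutNull -> ()  // default case: no token at all
        | ValueSome k -> yield TypeToken.Nullness k
        | ValueNone -> env.stable <- false          // assumption: unevaluated nullness marks the structure unstable
    }
```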
match r.Solution with
| Some ty -> yield! emitTType env ty
| None ->
    if r.Rigidity = TyparRigidity.Rigid then
        TypeToken.Rigid typarId
    else
        env.stable <- false
        TypeToken.Unsolved typarId
A basic checklist, please check each:
- Infinite chains of typars are handled (by laziness and truncation)
- Solved typar's hash is equal to its solution's hash (after nullness token filtering)
- Solved typar's hash is not equal to an unsolved typar's hash with the same stamp (this and the previous point are illustrated in the toy sketch after this list)
- Unsolved typars are never weakly cached on the reference itself (solved by a boolean flag gating reference caching)
- This is never called in parallel for the same input TType (since the context access is not thread-safe)
- The cache is kept small by caching only top-level TType_app, and their contents are normalized and stamp-independent
- The weak cache still makes sense; it just is not used whenever an unsolved typar is encountered. For all other scenarios, it can avoid the getTypeStructureOfStrippedType call.
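A self-contained toy model (stand-in FakeType, not the PR's code) of the two hash properties above: a solved typar produces the same tokens as its solution, and different tokens from the unsolved form even when the stamp is identical.

```fsharp
// Stand-in type shape: either a named application or a type variable.
type FakeType =
    | App of name: string * args: FakeType list
    | Var of stamp: int64 * solution: FakeType option

// Token emission in the spirit of the diff above: solved vars contribute their
// solution's structure, unsolved vars contribute one shared token and no stamp.
let rec tokens ty =
    match ty with
    | App (name, args) -> name :: (args |> List.collect tokens)
    | Var (_, Some solution) -> tokens solution
    | Var (_, None) -> [ "unsolved" ]

let solution = App ("System.String", [])
let solved = Var (7L, Some solution)
let unsolved = Var (7L, None)

printfn "%b" (tokens solved = tokens solution)  // true: solved hashes like its solution
printfn "%b" (tokens solved = tokens unsolved)  // false, despite the identical stamp
```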
The last point, about the weak cache still making sense, I am not entirely sure about. I'm trying to test informally by building FCS for net10.0 and also OpenTK 5.0, which have quite different profiles of cache use; OpenTK now does not benefit from weak memoization. I left it in with editor use in mind, but now that the cache works only on TType_apps, I wonder if it still helps.
let private emitNullness env (n: Nullness) =  // previously: let inline toNullnessToken (n: Nullness) =
    match n.TryEvaluate() with
    | ValueSome k -> TypeToken.Nullness k
The trace-based mutation + rollback might be a problem for the weak cache, see e.g. here:
fsharp/src/Compiler/Checking/ConstraintSolver.fs
Lines 1045 to 1050 in 3828c87
| Nullness.Variable nv1, _ ->
    trace.Exec (fun () -> nv1.Set nullness2) (fun () -> nv1.Unset())
    CompleteD
| _, Nullness.Variable nv2 ->
    trace.Exec (fun () -> nv2.Set nullness1) (fun () -> nv2.Unset())
    CompleteD
Or another one here:
trace.Exec (fun () -> r.typar_solution <- Some ty) (fun () -> r.typar_solution <- None)
The type structure generation has this covered, but the weak cache has not.
But fully clearing the weak cache whenever a trace's undo is called is way too defensive and would likely sacrifice a lot of perf potential :-(
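A toy illustration of the rollback hazard (stand-in FakeTypar and FakeTrace; the real Trace lives in the constraint solver): anything computed and weakly cached between a trace's Exec and its undo survives the rollback and no longer reflects the type var's actual state.

```fsharp
// Stand-in for a mutable type variable and for the solver's trace mechanism.
type FakeTypar = { mutable Solution: string option }

type FakeTrace() =
    let mutable undos: (unit -> unit) list = []
    member _.Exec (action: unit -> unit) (undo: unit -> unit) =
        action ()
        undos <- undo :: undos
    member _.Undo() =
        undos |> List.iter (fun undo -> undo ())
        undos <- []

let tp = { Solution = None }
let trace = FakeTrace()

// Speculative solving, as during overload resolution.
trace.Exec (fun () -> tp.Solution <- Some "System.String") (fun () -> tp.Solution <- None)

// Suppose a structure is computed (and weakly cached) at this point.
let cachedWhileSolved = defaultArg tp.Solution "unsolved"

trace.Undo()  // the solver backtracks

printfn "cached = %s, actual = %s" cachedWhileSolved (defaultArg tp.Solution "unsolved")
// cached = System.String, actual = unsolved: the weakly cached entry is now wrong
```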
I think the weak cache is not as crucial now that we limit caching to TType_app. Is nullness taken into account for TypeFeasiblySubsumesType? Maybe it's possible to not emit it at all.
(In the long run type structure generation could be made more configurable if needed, too.)
Ah, this happens not just for nullness.
Yes, it happens for typar solutions, e.g. method overloading, also in the context of picking SRTP overloads.
If we are in a Trace environment within constraint solving, it would be safest to never add things to the WeakCache.
(Alternatively, add it there but then remove it at Undo - but there might be overhead in keeping track of what to remove; I guess it's safer not to add...)
Yes, as a first approximation we can just not memoize when any typars come into play, solved or not. Good thing the performance is still way better than before.
I quickly asked Copilot: Q: Does TypeFeasiblySubsumesType take nullness into account? A: Short answer: No. The predicate is intentionally nullness-agnostic.
Description
Fixes #19037
Added repro test case.
The problem was in memoization of TType -> TypeStructure generation. TType is inherently mutable. One TType instance can result in different TypeFeasiblySubsumesType outcomes and different TypeStructure key fragments as type arguments get solved. For the purpose of type subsumption caching, we can treat as stable only fully solved types, with solved or rigid type vars. We still want the cache to use the unstable ones, but we must produce new TypeStructures from the TTypes as they evolve towards a solution.
Additionally, caching is now limited only to TType_app cases.
Note:
Cache keys do not take constraints into account. This seems to work in practice, because TypeFeasiblySubsumesType, where the cache is applied, does not use them either. If this turns out to be a problem, or we ever need to use TypeStructures somewhere else, we should emit a structural fingerprint of Typar.Constraints, too.